This paper presents a survey of elastic matching (EM) techniques employed in handwritten character recognition. EM, often also called deformable template, flexible matching, or nonlinear template matching, is defined as the optimization problem of two-dimensional warping (2DW), which specifies the pixel-to-pixel correspondence between two given character image patterns. The pattern distance evaluated under the optimized 2DW is invariant to a certain range of geometric deformations. Thus, by using the EM distance as a discriminant function, recognition systems robust to the deformations of handwritten characters can be realized. In this paper, EM techniques are classified according to the type of 2DW, and the properties of each class are outlined. Several topics related to EM, such as the category-dependent deformation tendency of handwritten characters, are also discussed.
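As a rough one-dimensional analogue of the warping optimization described in the abstract above, the following sketch computes a dynamic time warping distance between two sequences. It is not one of the surveyed 2DW methods, which operate on pixel grids and are considerably more involved; the function name `dtw_distance` and the absolute-difference cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Minimum cumulative |a[i] - b[j]| over monotonic, continuous warpings."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

A stretched copy of a sequence, such as `[0, 1, 1, 2]` against `[0, 1, 2]`, has distance zero under this measure, which is the kind of deformation invariance the abstract refers to.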
Yiping YANG Bilan ZHU Masaki NAKAGAWA
This paper proposes a "structuring search space" (SSS) method that aims to accelerate the recognition of large character sets. We divide the feature space of character categories into smaller clusters and derive the centroid of each cluster as a pivot. An input pattern is compared with all the pivots, and only a limited number of clusters whose pivots have higher similarity (or smaller distance) to the input pattern are searched, which accelerates recognition. This is based on the assumption that the search space is a distance space. We also consider two ways of candidate selection and finally combine them. The method has been applied to a practical off-line Japanese character recognizer, with the result that the coarse classification time is reduced to 56% and the whole recognition time is reduced to 52% while the original recognition rate is maintained.
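The pivot-based search in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the clusters are assumed to be given already, Euclidean distance stands in for the unspecified distance, and the names `build_index`, `search`, and `n_probe` are hypothetical.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def build_index(clusters):
    """clusters: list of lists of (label, vector) pairs.
    Returns a list of (pivot, members), where the pivot is the cluster centroid."""
    return [(centroid([v for _, v in members]), members) for members in clusters]

def search(index, x, n_probe=1):
    """Compare x with every pivot, then search only the n_probe nearest clusters."""
    ranked = sorted(index, key=lambda pm: euclidean(pm[0], x))
    candidates = [item for _, members in ranked[:n_probe] for item in members]
    # Fine search restricted to the selected clusters
    return min(candidates, key=lambda lv: euclidean(lv[1], x))[0]
```

With `n_probe` smaller than the number of clusters, only a fraction of the reference patterns is compared with the input, at the risk of missing the true nearest pattern when it lies in a cluster that was not probed.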
John GATES Miki HASEYAMA Hideo KITAJIMA
This paper presents a new conic section extraction approach that can extract all conic sections (lines, circles, ellipses, parabolas and hyperbolas) simultaneously. This approach is faster than the conventional approaches with a computational complexity that is O(n), where n is the number of edge pixels, and is robust in the presence of moderate levels of noise. It has been combined with a classification tree to produce an offline character recognition system that is invariant to scale, rotation, and translation. The system was tested with synthetic images and with images scanned from real world sources with good results.
Blur is one of the most basic characteristics of an image. In 1962, I discovered, for the first time in the world, that this blur is of Gaussian type. This paper outlines the historical details of these circumstances.
Arit THAMMANO Phongthep RUXPAKAWONG
Much research has been conducted on the recognition of Thai characters. Different approaches, such as neural network, syntactic, and structural methods, have been proposed. However, success in recognizing Thai characters is still limited compared with English characters. This paper proposes an approach to recognizing printed Thai characters using a hybrid of a global feature, local features, a fuzzy membership function, and a neural network. The global feature classifies all characters into seven main groups. Then the local features and the neural network are applied to identify the characters.
Hiroki TAKAHASHI Masayuki NAKAJIMA
In pattern recognition using neural networks, it is very difficult for researchers or users to design an optimal neural network architecture for a specific task. Almost any kind of neural network architecture can attain a certain recognition rate. It is, however, difficult to derive analytically an architecture that is optimal for a specific task in terms of recognition rate and training efficiency. In this paper, an evolutionary method of training and designing feedforward neural networks is proposed. In the proposed method, a neural network is defined as one individual, and neural networks that share the same architecture are defined as one species. The networks are evaluated by the normalized mean square error (MSE), which represents the performance of a network on the training patterns. Their architectures then evolve according to an evolution rule proposed here. Architectures of neural networks, in other words species, are evaluated by criteria different from those used for individuals. These criteria assess the best individual in the species and the speed of evolution of the species. The population size of each species is increased or decreased according to the criteria. The evolution rule generates slightly different neural network architectures from superior species. The proposed method can therefore generate a variety of neural network architectures. Results are presented for designing and training neural networks that classify simple 3×3 and 4×4 pixel patterns containing vertical, horizontal, and oblique lines, and that recognize handwritten Katakana. The efficiency of the proposed method is also discussed.
Mu-King TSAY Keh-Hwa SHYU Pao-Chung CHANG
In this paper, the generalized learning vector quantization (GLVQ) algorithm is applied to the design of a handwritten Chinese character recognition system. The proposed system consists of two modules: a feature transformation module and a recognizer. The feature transformation module is designed to extract discriminative features to enhance recognition performance. The initial feature transformation matrix is obtained by using Fisher's linear discriminant (FLD) function. The recognizer performs template matching with a minimum-distance criterion, and each character is represented by one reference template. These reference templates and the elements of the feature transformation matrix are trained by using the GLVQ algorithm. In the experiments, 540,100 (5,401 × 100) handwritten Chinese character samples are used to build the recognition system, and another 540,100 (5,401 × 100) samples are used for the open test. A good performance of 92.18% accuracy is achieved by the proposed system.
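To make the template training in the abstract above more concrete, here is a sketch of a single GLVQ update in its commonly cited form, with one reference template per class as in the abstract. It covers only the templates and omits the training of the feature transformation matrix. The function name `glvq_step`, the squared Euclidean distance, the sigmoid loss, and the learning rate `lr` are assumptions for illustration, not details taken from the paper.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def glvq_step(x, label, prototypes, lr=0.1):
    """One GLVQ update for sample x with true class `label`.
    prototypes: dict mapping class label -> template vector (list), updated in place.
    Returns the relative distance mu (negative when x is classified correctly)."""
    correct = prototypes[label]
    d1 = sq_dist(x, correct)
    # Nearest template belonging to any other class
    rival_label = min((k for k in prototypes if k != label),
                      key=lambda k: sq_dist(x, prototypes[k]))
    rival = prototypes[rival_label]
    d2 = sq_dist(x, rival)
    mu = (d1 - d2) / (d1 + d2)          # lies in [-1, 1]; assumes d1 + d2 > 0
    s = 1.0 / (1.0 + math.exp(-mu))
    g = s * (1.0 - s)                   # derivative of the sigmoid loss at mu
    denom = (d1 + d2) ** 2
    for i in range(len(x)):
        # Pull the correct template toward x, push the rival away
        # (constant factors are absorbed into lr)
        correct[i] += lr * g * (d2 / denom) * (x[i] - correct[i])
        rival[i] -= lr * g * (d1 / denom) * (x[i] - rival[i])
    return mu
```

Repeating this step over the training samples moves each template toward samples of its own class and away from nearby samples of other classes.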
Keiji YAMANAKA Susumu KUROYANAGI Akira IWATA
Based on a previous work on handwritten Japanese kanji character recognition, a postprocessing system for handwritten Japanese address recognition is proposed. Basically, the recognition system is composed of CombNET-II, a general-purpose large-scale character recognizer and MMVA, a modified majority voting system. Beginning with a set of character candidates, produced by a character recognizer for each character that composes the input word and a lexicon, an interpretation to the input word is generated. MMVA is used in the postprocessing stage to select the interpretation that accumulates the highest score. In the case of more than one possible interpretation, the Conflict Analyzing System calls the character recognizer again to generate scores for each character that composes each interpretation to determine the final output word. The proposed word recognition system was tested with 2 sets of handwritten Japanese city names, and recognition rates higher than 99% were achieved, demonstrating the effectiveness of the method.
Handprinted Chinese character recognition (HCCR) can be classified into two major approaches: statistical and structural. While neither of these two approaches can lead to a total and practical solution for HCCR, integrating them to take advantages of both seems to be a promising and obviously feasible approach. But, how to integrate them would be a big issue. In this paper, we propose an integrated HCCR system. The system starts from a statistical phase. This phase uses line-density-distribution-based features extracted after nonlinear normalization to guarantee that different writing variations of the same character have similar feature vectors. It removes accurately and efficiently the impossible candidates and results in a final candidate set. Then follows the structural phase, which inherits the line segments used in the statistical phase and extracts a set of stroke substructures as features. These features are used to discriminate the similar characters in the final candidate set and hence improve the recognition rate. Tested by using a large set of characters in a handprinted Chinese character database, the proposed HCCR system is robust and can achieve 96 percent accuracy for characters in the first 100 variations of the database.
Yasushi YAMAZAKI Naohisa KOMATSU
We propose an on-line writer verification method to improve the reliability of verifying a specific system user. Most recent research in the field of on-line writer verification focuses on signature verification. However, signature verification has a serious problem in that it will accept forged handwriting. To overcome this problem, we have introduced a text-indicated writer verification method. In this method, a different text including ordinary characters is used on every occasion of verification. This text can be selected automatically by the verification system so as to reflect the specific writer's personal features. A specific writer is accepted only when the same text as indicated by the verification system is input, and the system can verify the writer's personal features from the input text. Moreover, the characters used in the verification process can be different from those in the enrolment process. This method makes it more difficult to get away with forged handwriting than previous methods that use only signatures. We also discuss the reliability of the proposed method with some simulation results using handwriting data. These simulation results show that the method maintains high reliability without the use of signatures.
Rodney WEBSTER Masaki NAKAGAWA
This paper presents a character recognition method based on a dynamic model, which can be applied to character patterns from both on-line and off-line input. Other similar attempts simply treat on-line patterns as off-line input, while this method makes use of the on-line input's characteristics by representing the time information of handwriting in the character pattern representations. Experiments were carried out on the Hiragana character set. Without non-linear normalization, this method achieved recognition rates of 92.3% for on-line input and 89.1% for off-line input. When non-linear normalization is used, there is an increase in performance for both types of input with on-line input achieving 94.5% and off-line input achieving 94.1%. The reason for the difference in the effectiveness of non-linear normalization on off-line and on-line patterns could be that while the method used for off-line input was an established and proved one, we used our own initial attempt at non-linear normalization for the on-line patterns. If the same level of effectiveness of non-linear normalization as off-line input is achieved on the on-line input, however, the recognition rate for on-line input again improves becoming 96.3%. Since only one standard pattern was used per category for the dictionary patterns, the above results show the promise of this method. This result shows the compatibility of this method to both on-line and off-line input, as well as its effective use of on-line input's characteristics. The effectiveness of this use of the time information is shown by using an actual example. The data also shows the need for a method of non-linear normalization which is more suitable for on-line input.
Fang SUN Shin'ichiro OMACHI Hirotomo ASO
In this paper, a new algorithm for selection of candidates for handwritten character recognition is presented. Since we adopt the concept of the marginal radius to examine the confidence of candidates, the evaluation function is required to describe the pattern distribution correctly. For this reason, we propose Simplified Mahalanobis distance and observe its behavior by simulation. In the proposed algorithm, first, for each character, two types of feature regions (multi-dimensional one and one-dimensional one) are estimated from training samples statistically. Then, by referring to the feature regions, candidates are selected and verified. Using two types of feature regions is a principal characteristic of our method. If parameters are estimated accurately, the multi-dimensional feature region is extremely effective for character recognition. But generally, estimation errors in parameters occur, especially with a small number of sample patterns. Although the recognition ability of one-dimensional feature region is not so high, it can express the distribution comparatively precisely in one-dimensional space. By combining these feature regions, they will work concurrently to overcome the defects of each other. The effectiveness of the method is shown with the results of experiments.
Kazuki SARUTA Nei KATO Masato ABE Yoshiaki NEMOTO
In earlier work we proposed the Exclusive Learning neural NETwork (ELNET), which can be used to construct large-scale recognition systems for Chinese characters. However, it did not resolve the problem of how to use training samples effectively to generate more suitable recognition boundaries. In this paper, we propose ELNET-II, in which an attempt has been made to deal with this problem. Compared with ELNET, the method of selecting training samples is improved, and the size of each module varies according to the number of training samples for that module. In a recognition experiment on ETL9B (3036 categories) using ELNET-II, we obtained a maximum recognition rate of 95.84%. This is the first time that such a high recognition rate has been obtained by neural networks.
Hiroki MORI Hirotomo ASO Shozo MAKINO
A new postprocessing method using an interpolated n-gram model for Japanese documents is proposed. The method has advantages over conventional approaches in enabling high-speed, knowledge-free processing. In parameter estimation of an n-gram model for a large vocabulary, it is difficult to obtain sufficient training samples. To overcome this sparsity of samples, two smoothing methods for a Japanese character trigram model are evaluated, and the superiority of the deleted interpolation method is shown by using perplexity. A document recognition system based on the trigram model is constructed, which finds maximum likelihood solutions through the Viterbi algorithm. Experimental results for three kinds of documents show that performance is high when the deleted interpolation method is used for smoothing: 90% of OCR errors are corrected for documents similar to the training text data, and 75% of errors are corrected for documents not so similar to the training text data.
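The interpolated character trigram in the abstract above can be illustrated with a small sketch. It mixes unigram, bigram, and trigram relative frequencies with fixed weights. In deleted interpolation the weights are estimated from held-out data, so the `lambdas` values here are placeholders, and the function names are hypothetical. The Viterbi search over OCR candidates is not shown.

```python
from collections import Counter

def train_counts(text):
    """Collect character unigram, bigram, and trigram counts from a string."""
    uni = Counter(text)
    bi = Counter(zip(text, text[1:]))
    tri = Counter(zip(text, text[1:], text[2:]))
    return uni, bi, tri

def interpolated_prob(c1, c2, c3, counts, lambdas=(0.1, 0.3, 0.6)):
    """Estimate P(c3 | c1, c2) as a weighted mix of three relative frequencies.
    lambdas = (unigram, bigram, trigram) weights; assumed fixed here, whereas
    deleted interpolation would estimate them from held-out data."""
    uni, bi, tri = counts
    l1, l2, l3 = lambdas
    total = sum(uni.values())
    p1 = uni[c3] / total if total else 0.0
    p2 = bi[(c2, c3)] / uni[c2] if uni[c2] else 0.0
    p3 = tri[(c1, c2, c3)] / bi[(c1, c2)] if bi[(c1, c2)] else 0.0
    return l1 * p1 + l2 * p2 + l3 * p3
```

A trigram that never occurred in the training text still receives a nonzero probability through the lower-order terms, which is the purpose of smoothing when training samples are sparse.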
Satoshi NAOI Misako SUWA Maki YABUKI
The global interpolation method we proposed can extract a handwritten alpha-numeric character pattern even if it overlaps a border. Our method interpolates blank segments in a character after borders are removed by evaluating segment pattern continuity and connectedness globally to produce characters with smooth edges. The main feature of this method is to evaluate global component label connectivity as pattern connectedness. However, it is impossible for the method to interpolate missing superpositioning loop segments, because they lack segment pattern continuity and they have already had global component label connectivity. To solve this problem, we improved the method by adding loop interpolation as a global evaluation. The evaluation of character segment continuity is also improved to achieve higher quality character patterns. There is no database of overlapping characters, so we also propose an evaluation method which generates various kinds of overlapping numerals from an ETL database. Experimental results using these generated patterns showed that the improved global interpolation method is very effective for numbers that overlap a border.
Toru WAKAHARA Akira SUZUKI Naoki NAKAJIMA Sueharu MIYAHARA Kazumi ODAKA
This paper describes an on-line Kanji character recognition method that solves the one-to-one stroke correspondence problem with both the stroke-number and stroke-order variations common in cursive Japanese handwriting. We propose two kinds of complementary algorithms: one dissolves excessive mapping and the other dissolves deficient mapping. Their joint use realizes stable optimal stroke correspondence without combinatorial explosion. Also, three kinds of inter-stroke distances are devised to deal with stroke concatenation or splitting and heavy shape distortion. These new ideas greatly improve the stroke matching ability of the selective stroke linkage method reported earlier by the authors. In experiments, only a single reference pattern for each of 2,980 Kanji character categories is generated by using training data composed of 120 patterns written carefully with the correct stroke-number and stroke-order. Recognition tests are made using the training data and two kinds of test data in the square style and in the cursive style written by 36 different people; recognition rates of 99.5%, 97.6%, and 94.1% are obtained, respectively. Moreover, comparative results obtained by the current OCR technique as applied to bitmap patterns of on-line character data are presented. Finally, future work for enhancing the stroke matching approach to cursive Kanji character recognition is discussed.
Hidetoshi MIYAO Yasuaki NAKANO
In traditional note symbol extraction processes, extracted candidates of note elements were identified using complex if-then rules based on the note formation rules, and these rules needed subtle adjustment of parameters through many experiments. The purpose of our system is to avoid such tedious tasks and to provide accurate and high-speed extraction of note heads, stems, and flags according to the following procedure. (1) We extract head and flag candidates based on the stem positions. (2) To identify heads and flags among the candidates, we use a couple of three-layer neural networks. To train the networks, we give the position information and reliability factors of the candidates to the input units. (3) With the weights learned by the networks, the head and flag candidates are recognized. In experiments, we obtained a high extraction rate of more than 99% for thirteen printed piano scores on A4 sheets that present various difficulties. On a workstation (SPARC Station 10), processing took about 90 seconds on average, which means that our system can analyze piano scores at least 5 times as fast as manual work. Our system can therefore perform the task without the traditional tedious work and can recognize the symbols quickly and accurately.
This paper describes advances in the study of handwritten Kanji character recognition mainly performed in Japan. The research focus has shifted from the investigation of the possibility of recognition by the stroke structure analysis method to the study of the feasibility of recognition by the feature matching methods. A great number of features and their extraction methods have been proposed according to this approach. On the other hand, studies on pattern matching methods of recognizing Kanji characters using the character pattern itself have been made. The research efforts based on these two approaches have led to the empirical fact that handwritten Kanji character recognition would become more effective by paying greater attention to the feature of directionality. Furthermore, in an effort to achieve recognition with higher precision, active research work has been carried out on pre-processing techniques, such as the forced reshaping of input patterns, the development of more effective features, and nonlinear flexible matching algorithms. In spite of these efforts, the current character recognition techniques represent only "a skill of guessing characters" and are still on an insufficient technical level. Subsequent studies on character recognition must address the question of "how to understand characters".
Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by graphical designs such as those used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. This method extracts features from binary images, and a new similarity measure, the complementary similarity measure, is used as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts, and over 99% for reverse-contrast characters. Experiments on recognizing both the font style and the character category show that it also achieves high recognition rates in the presence of noise.
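The abstract above does not state the formula of the complementary similarity measure. The sketch below uses a form often quoted for it in the literature, (a·e − b·c) / sqrt(T·(n − T)), where a, b, c, and e count the four combinations of input and reference pixel values, T is the number of 1-pixels in the reference, and n is the pattern size. Treat this form, and the function name, as assumptions to be checked against the paper.

```python
import math

def complementary_similarity(f, t):
    """f: input binary pattern, t: reference binary pattern (equal-length 0/1 sequences).
    Assumes the reference is neither all zeros nor all ones."""
    pairs = list(zip(f, t))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # both on
    b = sum(1 for x, y in pairs if x == 1 and y == 0)  # on in input only
    c = sum(1 for x, y in pairs if x == 0 and y == 1)  # on in reference only
    e = sum(1 for x, y in pairs if x == 0 and y == 0)  # both off
    n = a + b + c + e
    t_on = a + c                                       # 1-pixels in the reference
    return (a * e - b * c) / math.sqrt(t_on * (n - t_on))
```

Under this form, a reverse-contrast copy of the reference yields the same magnitude with the opposite sign, and extra foreground pixels in the input lower the score without changing its sign.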
Hideaki YAMAGATA Hirobumi NISHIDA Toshihiro SUZUKI Michiyoshi TACHIKAWA Yu NAKAJIMA Gen SATO
Handwritten character recognition has been growing in importance, and its application areas have been expanding to office automation, postal service automation, automatic data entry to computers, and so on. It is challenging to develop a handwritten character recognition system with high processing speed, high performance, and high portability, because there is a trade-off among them. With current technology, it is difficult to attain high performance and high processing speed at the same time with a single algorithm, and we therefore need to find an efficient way of combining multiple algorithms. We present an engineering solution to this problem. The system as a whole is based on a multi-stage strategy. The first stage is a simple, fast, and reliable recognition algorithm with a low substitution-error rate; data of high quality are recognized in this stage, whereas sloppily written or degraded data are rejected and sent to the second stage. The second stage is composed of a sophisticated structural pattern classifier and a pattern matching classifier, and these two complementary algorithms run in parallel (multiple expert approach). We demonstrate the performance of the completed system by experiments using real data.
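The multi-stage strategy in the abstract above can be sketched as a cascade with rejection. The classifiers are passed in as plain callables returning a `(label, confidence)` pair. The threshold value, the rule of summing expert confidences, and all names are assumptions for illustration, and the experts are called one after another here rather than in parallel.

```python
def recognize(pattern, fast_stage, experts, reject_threshold=0.9):
    """Two-stage recognition: accept the fast result when it is confident,
    otherwise reject and defer to the second-stage experts."""
    label, confidence = fast_stage(pattern)
    if confidence >= reject_threshold:
        return label
    # Rejected by the first stage: combine the experts' opinions
    votes = {}
    for expert in experts:
        expert_label, expert_conf = expert(pattern)
        votes[expert_label] = votes.get(expert_label, 0.0) + expert_conf
    return max(votes, key=votes.get)
```

Raising `reject_threshold` sends more patterns to the slower second stage, trading processing speed for a lower substitution-error rate in the first stage.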